Automatic structuring and retrieval of large text files



Related articles

Automatic structuring of text files

SUMMARY In many practical information retrieval situations, it is necessary to process heterogeneous text databases that vary greatly in scope and coverage, and deal with many different subjects. In such an environment it is important to provide flexible access to individual text pieces, and to structure the collection so that related text elements are identified and appropriately linked. Metho...


Automatic Text Decomposition and Structuring

Sophisticated text similarity measurements are used to determine relationships between natural-language texts and text segments. The resulting linked hypertext maps are used to identify different text types and text structures, leading to improved text access and utilization. Examples of text decomposition are given for expository and non-expository texts. The vector processing model of retriev...
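As a rough illustration of what a text similarity measurement between segments can look like, here is a minimal sketch that assumes raw term-frequency vectors and cosine similarity. The function names `term_vector` and `cosine_similarity` and the tokenization rule are assumptions made for this example; the abstract does not specify the weighting or the measure actually used.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Turn a text into a bag-of-words vector of raw term counts (an assumed weighting)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term vectors of two texts; 0.0 if either is empty."""
    vec_a, vec_b = term_vector(text_a), term_vector(text_b)
    # Only terms present in both texts contribute to the dot product.
    dot = sum(vec_a[term] * vec_b[term] for term in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


segment_1 = "Text retrieval systems index large text files."
segment_2 = "Large text files need retrieval systems."
segment_3 = "The weather was pleasant yesterday."

# Segments sharing vocabulary score higher than unrelated ones.
print(cosine_similarity(segment_1, segment_2))
print(cosine_similarity(segment_1, segment_3))
```

In a sketch like this, two segments would be linked when their similarity exceeds a chosen threshold; the threshold value is a tuning decision and is not given in the abstract.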


Economical Inversion of Large Text Files

To provide keyword-based access to a large text file it is usually necessary to invert the file and create an inverted index that stores, for each word in the file, the paragraph or sentence numbers in which that word occurs. Inverting a large file using traditional techniques may take as much temporary disk space as is occupied by the file itself, and consume a great deal of CPU time. Here we d...
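To make the data structure concrete, the following is a minimal in-memory sketch of an inverted index that maps each word to the paragraph numbers in which it occurs. It is a naive illustration only, not the economical inversion technique the abstract goes on to describe; the function name `build_inverted_index` and the tokenization rule are assumptions made for this example.

```python
import re
from collections import defaultdict


def build_inverted_index(paragraphs):
    """Map each lowercased word to the sorted paragraph numbers (1-based) containing it."""
    postings = defaultdict(set)
    for number, text in enumerate(paragraphs, start=1):
        # Assumed tokenization: runs of ASCII letters and digits.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            postings[word].add(number)
    return {word: sorted(numbers) for word, numbers in postings.items()}


paragraphs = [
    "Inverted files give keyword access to text.",
    "Building the index can need a lot of disk space.",
    "Keyword access relies on the index.",
]

index = build_inverted_index(paragraphs)
print(index["index"])    # paragraph numbers containing "index"
print(index["keyword"])  # paragraph numbers containing "keyword"
```

This version holds every posting set in memory at once, which is exactly the kind of resource cost that becomes a problem for large files.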


Automatic Text Categorization and Its Application to Text Retrieval

We develop an automatic text categorization approach and investigate its application to text retrieval. The categorization approach is derived from a combination of a learning paradigm known as instance-based learning and an advanced document retrieval technique known as retrieval feedback. We demonstrate the effectiveness of our categorization approach using two real-world document collections...


Improving the Automatic Retrieval of Text Documents

This paper reports on a statistical stemming algorithm based on link analysis. Considering that a word is formed by a prefix (stem) and a suffix, the key idea is that the interlinked prefixes and suffixes form a community of sub-strings. Thus, discovering these communities means searching for the best word splits that give the best word stems. The algorithm has been used in our participation in...



Journal

Journal title: Communications of the ACM

Year: 1994

ISSN: 0001-0782,1557-7317

DOI: 10.1145/175235.175243